Tutorial

Fusion Gene Study-Tutorial


Data preparation of SRR064286

  1. Download the read sequences from:
 ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR064/SRR064286/ 
Or
 ftp://ftp.ddbj.nig.ac.jp/ddbj_database/dra/fastq/SRA023/SRA023117/SRX025827
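The download can also be scripted. The following is a minimal sketch, assuming that wget is installed and that the EBI folder holds the two mate files under the names used later in this tutorial (SRR064286_1.fastq.gz and SRR064286_2.fastq.gz). The function names read_url and fetch_reads are hypothetical helpers, not part of any package.

```shell
#!/bin/sh
# Folder of the run on the EBI server (taken from the step above).
BASE_URL="ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR064/SRR064286"
RUN="SRR064286"

# Print the URL of one mate file (1 or 2).
# Assumption: the files are named <run>_<mate>.fastq.gz.
read_url() {
    printf '%s/%s_%s.fastq.gz\n' "$BASE_URL" "$RUN" "$1"
}

# Download both mate files; -c resumes an interrupted transfer.
fetch_reads() {
    for mate in 1 2; do
        wget -c "$(read_url "$mate")" || return 1
    done
}

# List what would be downloaded; call fetch_reads to start the transfer.
read_url 1
read_url 2
```

Calling fetch_reads in the working folder starts the transfer. The uncompressed variant of the commands further down additionally needs the files unpacked, for example with gunzip.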

De Novo assembly based method

Install and configure programs

  1. Install the wine and mono packages through the Software Manager of Linux Mint.
  2. Download and extract the EBARDenovo and TranslocCheck packages from http://sourceforge.net/projects/ebardenovo/.
  3. Download and extract the gmap-gsnap package from http://research-pub.gene.com/gmap/. The Linux Mint software repositories carry an old version of the gmap-gsnap package [11]. Because the gmap program changed the format of its reference database after 2014-03-28, using the old versions is not recommended.
  4. Install the zlib-dev package (listed as zlib1g-dev in Ubuntu-based repositories) through the Software Manager of Linux Mint. It is required to compile the gmap program.
  5. Run the following commands in the extracted gmap-gsnap folder to build and install the gmap program:
 ./configure
 make
 sudo make install
  6. Download the chromosome files *.gz from ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/, extract them into a folder, and run the following command in that folder:
 sudo gmap_build -d hg19 -k 15 *.fa
  7. Install the UGENE package through the Software Manager of Linux Mint [12]. UGENE is used to display the alignment of chimeric transcripts.
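Step 6 can be wrapped in a small script. This is only a sketch under assumptions: wget and gunzip are installed, the compressed chromosome files end in .fa.gz, and the function names (fetch_chromosomes, extract_chromosomes, build_index) are hypothetical. The gmap_build call is the one from step 6.

```shell
#!/bin/sh
CHROM_URL="ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes"

# Fetch every compressed chromosome file into the given folder.
# wget expands the wildcard itself for FTP URLs, hence the quotes.
fetch_chromosomes() {
    mkdir -p "$1" && (cd "$1" && wget -c "$CHROM_URL/*.gz")
}

# Decompress every *.fa.gz in the given folder and print how many
# FASTA files are then available for gmap_build.
extract_chromosomes() {
    for gz in "$1"/*.fa.gz; do
        [ -e "$gz" ] || continue      # the glob matched nothing
        gunzip -f "$gz" || return 1
    done
    ls "$1"/*.fa 2>/dev/null | wc -l
}

# Build the GMAP database "hg19" with the options from step 6.
build_index() {
    (cd "$1" && sudo gmap_build -d hg19 -k 15 *.fa)
}
```

A typical sequence would be fetch_chromosomes hg19_fa, then extract_chromosomes hg19_fa, then build_index hg19_fa, where hg19_fa is an arbitrary folder name.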

Run EBARDenovo and GMAP programs

Using the original compressed read files:
 mono EBARDenovo.exe SRR064286_1.fastq.gz SRR064286_2.fastq.gz -o ebar286.fa -Q
 gmap -d hg19 --split-output=map_ebar_286 ebar286.fa
 gmap -d hg19 --split-output=map_chix_286 ebar286-chimera.fa
 mono TranslocCheck.exe SRR064286_1.fastq.gz SRR064286_2.fastq.gz -c ebar286.fa -t map_ebar_286.transloc
 mono TranslocCheck.exe SRR064286_1.fastq.gz SRR064286_2.fastq.gz -c ebar286-chimera.fa -t map_chix_286.transloc -S
Using uncompressed read files:
 mono EBARDenovo.exe SRR064286_1.fastq SRR064286_2.fastq -o ebar286.fa -Q
 gmap -d hg19 --split-output=map_ebar_286 ebar286.fa
 gmap -d hg19 --split-output=map_chix_286 ebar286-chimera.fa
 mono TranslocCheck.exe SRR064286_1.fastq SRR064286_2.fastq -c ebar286.fa -t map_ebar_286.transloc
 mono TranslocCheck.exe SRR064286_1.fastq SRR064286_2.fastq -c ebar286-chimera.fa -t map_chix_286.transloc -S
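After the runs finish, a quick record count helps to confirm that each step produced output. The sketch below assumes that every record starts with a ">" header line, which holds for FASTA files and, as far as we know, for the default GMAP output format as well. The function name count_records is hypothetical.

```shell
#!/bin/sh
# Count the lines that start with ">" (one per record).
# Prints 0 when the file is missing or empty.
count_records() {
    if [ -s "$1" ]; then
        grep -c '^>' "$1" || true     # grep exits 1 when the count is 0
    else
        echo 0
    fi
}

# Report the record count of each result file named in the commands above.
for f in ebar286.fa ebar286-chimera.fa \
         map_ebar_286.transloc map_chix_286.transloc; do
    printf '%s\t%s\n' "$f" "$(count_records "$f")"
done
```

A count of 0 for a .transloc file simply means that gmap wrote no translocation candidates to that file.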

Mapping based method

Install and configure TopHat-Fusion

  1. Install the TopHat package. TopHat-Fusion is a feature of the TopHat package, which is included in the software repositories of several Linux distributions, such as Ubuntu (http://www.ubuntu.com/) and Linux Mint (http://www.linuxmint.com/). Linux users can install the TopHat package through the Ubuntu Software Center or the Mint Software Manager.
  2. Install the three required bioinformatics packages: blast, bowtie and samtools. All are available through the Ubuntu Software Center or the Mint Software Manager as well.
  3. Download and extract the Bowtie indexes of the human reference genome hg19 from ftp://ftp.cbcb.umd.edu/pub/data/bowtie_indexes/hg19.ebwt.zip . The commands below assume that the index files are extracted into ~/hg19.
  4. Download and extract the Ensembl gene annotation file ensGene.txt from http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/ensGene.txt.gz.
  5. Download and extract the RefSeq gene annotation file refGene.txt from http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/refGene.txt.gz.
  6. Create the directory “~/blast/human_genomic”, then download the BLAST database files human_genomic*.gz from ftp://ftp.ncbi.nlm.nih.gov/blast/db/ and extract them into it.
  7. Create the directory “~/blast/nt”, then download the BLAST database files nt*.gz from ftp://ftp.ncbi.nlm.nih.gov/blast/db/ and extract them into it.

Run tophat program

 tophat --fusion-search -o tophat_MCF7 -p 6 --keep-fasta-order --bowtie1 --no-coverage-search -r 0 --mate-std-dev 80 --max-intron-length 100000 --fusion-min-dist 100000 --fusion-anchor-length 13 --fusion-ignore-chromosomes chrM ~/hg19/hg19 SRR064286_1.fastq SRR064286_2.fastq
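The command above is long, so the sketch below assembles the same call piece by piece and prints it. The comments give our reading of each option and should be checked against the TopHat manual. The function name tophat_fusion_cmd is hypothetical, and ~ is written as $HOME.

```shell
#!/bin/sh
# Assemble the tophat call from the step above and print it on one line.
tophat_fusion_cmd() {
    set -- tophat --fusion-search                 # enable fusion detection
    set -- "$@" -o tophat_MCF7 -p 6               # output folder; 6 threads
    set -- "$@" --keep-fasta-order --bowtie1      # keep reference order; use Bowtie 1
    set -- "$@" --no-coverage-search              # skip coverage-based junction search
    set -- "$@" -r 0 --mate-std-dev 80            # mate inner distance: mean 0, std dev 80
    set -- "$@" --max-intron-length 100000        # longest intron accepted
    set -- "$@" --fusion-min-dist 100000          # min. distance for same-chromosome fusions
    set -- "$@" --fusion-anchor-length 13         # min. bases on each side of a fusion
    set -- "$@" --fusion-ignore-chromosomes chrM  # ignore the mitochondrial genome
    set -- "$@" "$HOME/hg19/hg19"                 # Bowtie index basename
    set -- "$@" SRR064286_1.fastq SRR064286_2.fastq
    echo "$@"
}

tophat_fusion_cmd
```

The printed line can be pasted into a terminal, or piped to sh to run it directly.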

Run tophat-fusion-post program

 tophat-fusion-post -p 6 --num-fusion-reads 1 --num-fusion-pairs 2 --num-fusion-both 5 ~/hg19/hg19
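Before starting tophat-fusion-post, it may help to check that its inputs are in place. The sketch below assumes that the program is started in the folder that contains refGene.txt, ensGene.txt, the blast folder and at least one tophat_* output folder; this layout is our reading of the steps above and of the TopHat-Fusion documentation. The function name check_fusion_post_inputs is hypothetical.

```shell
#!/bin/sh
# Check the given folder (default: current folder) for the inputs of
# tophat-fusion-post. Returns 0 when everything is found, 1 otherwise.
check_fusion_post_inputs() {
    dir="${1:-.}"
    missing=0
    for item in refGene.txt ensGene.txt blast/human_genomic blast/nt; do
        if [ ! -e "$dir/$item" ]; then
            echo "missing: $dir/$item" >&2
            missing=$((missing + 1))
        fi
    done
    # At least one tophat_* output folder is expected as well.
    found=0
    for d in "$dir"/tophat_*; do
        if [ -d "$d" ]; then
            found=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "missing: $dir/tophat_* output folder" >&2
        missing=$((missing + 1))
    fi
    [ "$missing" -eq 0 ]
}
```

For example, check_fusion_post_inputs ~ can be run first, and tophat-fusion-post started only if it reports nothing missing.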